Overview of analyses

Risk of Bias (RoB)

Risk of bias was evaluated across predefined methodological domains and summarized at both the domain and study levels. Distributions of low, unclear, and high risk ratings are presented using summary bar plots, while individual study judgments are visualized with traffic-light plots.

Inter-rater reliability

Inter-rater reliability was assessed using observed agreement, Cohen’s kappa, and prevalence-adjusted bias-adjusted kappa (PABAK). Agreement patterns are further examined using confusion matrices, heatmaps, and domain-specific reliability estimates to identify areas of consistent and inconsistent interpretation.


Risk of Bias (RoB)

Summary bar plot

Risk of bias summary across domains. Bars show proportions of Low, Unclear, and High scores.

Interpretation. Domains with a higher proportion of Low judgments suggest stronger methodological rigor. Domains dominated by Unclear or High judgments indicate potential limitations: Unclear typically reflects insufficient reporting, whereas High reflects an identified methodological concern. The "Overall" bar summarizes study-level risk of bias.
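The proportions behind a summary bar plot can be sketched as follows. This is a minimal illustration, not the code used for this report: the `judgments` records, the study names, and the `domain_proportions` helper are hypothetical, and the ratings are invented for the example.

```python
from collections import Counter

# Hypothetical judgments as (study, domain, rating); NOT data from this report.
judgments = [
    ("Study A", "Sequence generation", "Unclear"),
    ("Study B", "Sequence generation", "Low"),
    ("Study C", "Sequence generation", "Unclear"),
    ("Study A", "Incomplete outcome data", "Low"),
    ("Study B", "Incomplete outcome data", "High"),
    ("Study C", "Incomplete outcome data", "Low"),
]

LEVELS = ("Low", "Unclear", "High")


def domain_proportions(rows):
    """Return {domain: {level: proportion}} over the levels in LEVELS."""
    counts = {}
    for _study, domain, rating in rows:
        counts.setdefault(domain, Counter())[rating] += 1
    result = {}
    for domain, counter in counts.items():
        total = sum(counter.values())
        # Counter returns 0 for levels that never occur in a domain.
        result[domain] = {level: counter[level] / total for level in LEVELS}
    return result


proportions = domain_proportions(judgments)
for domain, shares in proportions.items():
    print(domain, {level: round(p, 3) for level, p in shares.items()})
```

Each domain's three proportions sum to 1, which is what allows the bars to be stacked to a common length.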

Traffic-light plots

Traffic-light plots show risk-of-bias judgments by study and domain.
Green indicates low risk of bias, yellow indicates unclear risk, and red indicates high risk of bias.


Inter-rater reliability

Overall agreement, Cohen’s kappa, and PABAK

Overall inter-rater reliability metrics.

| Metric | Value |
|---|---|
| Observed agreement | 0.813 |
| Cohen’s kappa | 0.643 |
| PABAK | 0.627 |

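Interpretation. Observed agreement is the proportion of paired ratings on which the two reviewers gave the same judgment. Cohen’s kappa corrects observed agreement for the agreement expected by chance, given each reviewer’s marginal rating frequencies. PABAK (prevalence-adjusted, bias-adjusted kappa) instead assumes equal category prevalence and no systematic difference between reviewers, so it depends only on observed agreement and the number of categories.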
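For reference, the three metrics can be written out in standard notation. The symbols are not from the original report and are introduced here only for illustration: n_ij is the number of rating pairs in which Reviewer1 chose category i and Reviewer2 chose category j, N is the total number of pairs, k is the number of categories, and n_i+ and n_+i are the row and column totals.

```latex
p_o = \frac{1}{N}\sum_{i=1}^{k} n_{ii}, \qquad
p_e = \sum_{i=1}^{k} \frac{n_{i+}}{N}\cdot\frac{n_{+i}}{N}, \qquad
\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad
\mathrm{PABAK} = \frac{k\,p_o - 1}{k - 1}
```

For k = 2 the PABAK expression reduces to 2 p_o - 1. With p_o = 0.813 this gives 0.626, which agrees with the reported 0.627 up to rounding of p_o. This suggests, though the report does not state it, that the two-category form was applied. With k = 3 the same expression would give about 0.720.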

Agreement / disagreement / critical disagreement

Agreement, disagreement, and critical disagreement counts and proportions.

| Category | Count | Proportion |
|---|---|---|
| Agreement | 2147 | 0.813 |
| Disagreement | 493 | 0.187 |
| Critical disagreement (Low vs High) | 128 | 0.048 |

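Interpretation. Critical disagreement captures the most consequential mismatch, in which one reviewer rated Low and the other High. These 128 rating pairs are a subset of the 493 disagreements, not an additional category, so the three proportions do not sum to 1.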
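The three counts can be recovered directly from the confusion matrix reported in the next subsection. The sketch below is illustrative only. It assumes rows are Reviewer1 and columns are Reviewer2, although these counts do not depend on that orientation. The variable names are hypothetical.

```python
# Confusion matrix as reported in this document.
# Assumption: rows = Reviewer1, columns = Reviewer2, categories in the order below.
LABELS = ["High", "Low", "Unclear"]
MATRIX = [
    [55, 67, 54],
    [61, 692, 137],
    [49, 125, 1400],
]

total = sum(sum(row) for row in MATRIX)

# Agreement: both reviewers chose the same category (main diagonal).
agreement = sum(MATRIX[i][i] for i in range(len(LABELS)))

# Disagreement: everything off the diagonal.
disagreement = total - agreement

# Critical disagreement: one reviewer said High, the other Low.
hi, lo = LABELS.index("High"), LABELS.index("Low")
critical = MATRIX[hi][lo] + MATRIX[lo][hi]

for name, count in [
    ("Agreement", agreement),
    ("Disagreement", disagreement),
    ("Critical disagreement", critical),
]:
    print(f"{name}: {count} ({count / total:.3f})")
```

The total of 2640 rating pairs corresponds to 264 ratings for each of the 10 RoB items.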

Confusion matrix

Confusion matrix (Reviewer1 × Reviewer2).

| Reviewer1 \ Reviewer2 | High | Low | Unclear |
|---|---|---|---|
| High | 55 | 67 | 54 |
| Low | 61 | 692 | 137 |
| Unclear | 49 | 125 | 1400 |

Confusion-matrix heatmap.

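Interpretation. The heatmap shows where disagreements concentrate. Off-diagonal cells represent mismatches; in the table above, the largest off-diagonal counts are the Low vs Unclear cells (137 and 125). The High × Low and Low × High cells correspond to critical disagreement. These cells fall in the corners only when the categories are ordered Low, Unclear, High; in the alphabetical ordering of the table above (High, Low, Unclear) they sit next to the diagonal.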
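As a cross-check, Cohen's kappa can be recomputed from the confusion matrix above. This is a minimal sketch, not the report's own code: the `cohens_kappa` helper is hypothetical, and the rows = Reviewer1, columns = Reviewer2 orientation is an assumption (kappa is unchanged if the matrix is transposed).

```python
def cohens_kappa(matrix):
    """Return (p_o, p_e, kappa) for a square confusion matrix of counts."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of pairs on the main diagonal.
    p_o = sum(matrix[i][i] for i in range(k)) / total
    # Marginal totals for each reviewer.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Chance-expected agreement from the product of the marginals.
    p_e = sum(row_totals[i] * col_totals[i] for i in range(k)) / total**2
    return p_o, p_e, (p_o - p_e) / (1 - p_e)


# Counts as reported in this document (category order: High, Low, Unclear).
MATRIX = [
    [55, 67, 54],
    [61, 692, 137],
    [49, 125, 1400],
]

p_o, p_e, kappa = cohens_kappa(MATRIX)
print(f"observed agreement = {p_o:.3f}")
print(f"expected agreement = {p_e:.3f}")
print(f"Cohen's kappa      = {kappa:.3f}")
```

Chance-expected agreement is close to 0.48 because both reviewers assigned Unclear to roughly 60% of ratings. That is why an observed agreement of 0.813 translates into a kappa of 0.643.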

Agreement metrics by RoB item

Inter-rater reliability by RoB item: observed agreement, Cohen’s kappa, and PABAK.

| RoB item | N | Observed agreement | Cohen’s kappa | PABAK |
|---|---|---|---|---|
| 1 - Sequence generation | 264 | 0.981 | 0.538 | 0.962 |
| 2 - Baseline characteristics | 264 | 0.807 | 0.605 | 0.614 |
| 4 - Random housing | 264 | 0.981 | 0.437 | 0.962 |
| 5 - Blinding - experiment | 264 | 0.803 | 0.145 | 0.606 |
| 6 - Random outcome assessment | 264 | 0.958 | 0.333 | 0.917 |
| 7 - Blinding - outcome assessment | 264 | 0.883 | 0.765 | 0.765 |
| 8 - Incomplete outcome data | 264 | 0.682 | 0.439 | 0.364 |
| 9 - Selective outcome reporting | 264 | 0.769 | 0.215 | 0.538 |
| 10.1 - Other - pseudoreplication | 264 | 0.557 | 0.245 | 0.114 |
| 10.2 - Other - procedural equivalence | 264 | 0.712 | 0.179 | 0.424 |

Interpretation. Item-level metrics show which domains are consistently rated and which are harder to judge reliably. Items with lower kappa often reflect ambiguous reporting, subjective criteria, or unbalanced category prevalence.
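One way to see which items are most affected by category prevalence is to rank them by the gap between PABAK and Cohen's kappa. The sketch below uses the values from the item-level table above. The `ITEMS` and `ranked` names are hypothetical, and the ranking is an illustrative device rather than an analysis performed in this report.

```python
# (item, observed agreement, Cohen's kappa, PABAK) as reported in the table above.
ITEMS = [
    ("1 - Sequence generation", 0.981, 0.538, 0.962),
    ("2 - Baseline characteristics", 0.807, 0.605, 0.614),
    ("4 - Random housing", 0.981, 0.437, 0.962),
    ("5 - Blinding - experiment", 0.803, 0.145, 0.606),
    ("6 - Random outcome assessment", 0.958, 0.333, 0.917),
    ("7 - Blinding - outcome assessment", 0.883, 0.765, 0.765),
    ("8 - Incomplete outcome data", 0.682, 0.439, 0.364),
    ("9 - Selective outcome reporting", 0.769, 0.215, 0.538),
    ("10.1 - Other - pseudoreplication", 0.557, 0.245, 0.114),
    ("10.2 - Other - procedural equivalence", 0.712, 0.179, 0.424),
]

# Sort by PABAK minus kappa, largest gap first.
ranked = sorted(ITEMS, key=lambda row: row[3] - row[2], reverse=True)

for item, p_o, kappa, pabak in ranked:
    print(f"{item:<40} agreement = {p_o:.3f}  gap = {pabak - kappa:+.3f}")
```

Items 6, 4, 5, and 1 show the largest positive gaps. All four combine high observed agreement with low or moderate kappa, a pattern consistent with ratings concentrated in a single category. Items 8 and 10.1 show a negative gap, meaning kappa exceeds PABAK for those items.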

Observed agreement vs Cohen’s kappa

Observed agreement vs Cohen’s kappa by RoB item. Points far to the right but low on the y-axis indicate high raw agreement driven by category imbalance (prevalence).

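Interpretation. Observed agreement can be high even when kappa is moderate or low, especially when one category dominates (prevalence effect). For example, items 1 (Sequence generation) and 4 (Random housing) both have an observed agreement of 0.981, yet their kappa values are only 0.538 and 0.437.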
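The prevalence effect can be illustrated with two made-up two-category tables that share the same observed agreement. Both tables are hypothetical and are not derived from this report's data; `agreement_and_kappa` is an illustrative helper.

```python
def agreement_and_kappa(matrix):
    """Return (observed agreement, Cohen's kappa) for a square count matrix."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    p_o = sum(matrix[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(row_totals[i] * col_totals[i] for i in range(k)) / total**2
    return p_o, (p_o - p_e) / (1 - p_e)


# Hypothetical tables, 100 rating pairs each (category order: Unclear, Low).
skewed = [    # almost every rating is Unclear
    [96, 2],
    [2, 0],
]
balanced = [  # ratings split evenly between the two categories
    [48, 2],
    [2, 48],
]

skewed_po, skewed_kappa = agreement_and_kappa(skewed)
balanced_po, balanced_kappa = agreement_and_kappa(balanced)

print(f"skewed:   agreement = {skewed_po:.2f}, kappa = {skewed_kappa:.3f}")
print(f"balanced: agreement = {balanced_po:.2f}, kappa = {balanced_kappa:.3f}")
```

Both tables have 96% observed agreement, but kappa is about -0.02 for the skewed table and 0.92 for the balanced one. When nearly all ratings fall in one category, chance alone already produces about 96% agreement, so the observed agreement carries almost no information about reliability.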

Inter-rater agreement by risk-of-bias domain

Inter-rater agreement by RoB item. Cell color represents the proportion of agreement across studies.

RoB domain key.
1 = Sequence generation;
2 = Baseline characteristics;
4 = Random housing;
5 = Blinding – experiment;
6 = Random outcome assessment;
7 = Blinding – outcome assessment;
8 = Incomplete outcome data;
9 = Selective outcome reporting;
10.1 = Other – pseudoreplication;
10.2 = Other – procedural equivalence.

Session information

## R version 4.5.1 (2025-06-13)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.5
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/Sao_Paulo
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggrepel_0.9.6 knitr_1.50    forcats_1.0.0 scales_1.4.0  ggplot2_4.0.1
## [6] stringr_1.5.2 tidyr_1.3.1   dplyr_1.1.4   readxl_1.4.5 
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.6       jsonlite_2.0.0     compiler_4.5.1     Rcpp_1.1.0        
##  [5] tidyselect_1.2.1   jquerylib_0.1.4    textshaping_1.0.4  systemfonts_1.3.1 
##  [9] yaml_2.3.10        fastmap_1.2.0      R6_2.6.1           labeling_0.4.3    
## [13] generics_0.1.4     tibble_3.3.0       bslib_0.9.0        pillar_1.11.1     
## [17] RColorBrewer_1.1-3 rlang_1.1.6        cachem_1.1.0       stringi_1.8.7     
## [21] xfun_0.53          S7_0.2.0           sass_0.4.10        cli_3.6.5         
## [25] withr_3.0.2        magrittr_2.0.4     digest_0.6.37      grid_4.5.1        
## [29] rstudioapi_0.17.1  lifecycle_1.0.4    vctrs_0.6.5        writexl_1.5.4     
## [33] evaluate_1.0.5     glue_1.8.0         farver_2.1.2       cellranger_1.1.0  
## [37] ragg_1.5.0         prettydoc_0.4.1    rmarkdown_2.29     purrr_1.1.0       
## [41] tools_4.5.1        pkgconfig_2.0.3    htmltools_0.5.8.1